88 research outputs found

F?D: On understanding the role of deep feature spaces on face generation evaluation

Author: Balakrishnan Guha
Kabra Krish
Publication date: 31/05/2023

Perceptual metrics, such as the Fréchet Inception Distance (FID), are widely used to assess the similarity between synthetically generated and ground truth (real) images. The key idea behind these metrics is to compute errors in a deep feature space that captures perceptually and semantically rich image features. Despite their popularity, the effect that different deep features and their design choices have on a perceptual metric has not been well studied. In this work, we perform a causal analysis linking differences in semantic attributes and distortions between face image distributions to Fréchet distances (FD) using several popular deep feature spaces. A key component of our analysis is the creation of synthetic counterfactual faces using deep face generators. Our experiments show that the FD is heavily influenced by its feature space's training dataset and objective function. For example, FD using features extracted from ImageNet-trained models heavily emphasizes hats over regions like the eyes and mouth. Moreover, FD using features from a face gender classifier emphasizes hair length more than distances in an identity (recognition) feature space do. Finally, we evaluate several popular face generation models across feature spaces and find that StyleGAN2 consistently ranks higher than other face generators, except with respect to identity (recognition) features. This suggests the need for considering multiple feature spaces when evaluating generative models and using feature spaces that are tuned to nuances of the domain of interest.

Comment: Code and dataset to be released soon

arXiv.org e-Print Archive
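As a rough illustration of the Fréchet distance (FD) discussed in the abstract above: for two Gaussians fitted to feature sets, the squared distance is $\|\mu_1-\mu_2\|^2 + \mathrm{Tr}\big(\Sigma_1+\Sigma_2-2(\Sigma_1\Sigma_2)^{1/2}\big)$. The sketch below is not the authors' code. It assumes diagonal covariances, in which case the trace term reduces to a sum of squared differences of per-dimension standard deviations. The function name `diagonal_frechet_distance` and the toy feature lists are hypothetical.

```python
from statistics import fmean, pstdev


def diagonal_frechet_distance(feats_a, feats_b):
    """Squared Frechet distance between two feature sets.

    Each set is a list of equal-length feature vectors. A Gaussian is fitted
    to each set, and the covariance is ASSUMED diagonal (a simplification;
    FID as usually computed uses full covariance matrices).
    """
    dims = len(feats_a[0])
    total = 0.0
    for d in range(dims):
        col_a = [row[d] for row in feats_a]
        col_b = [row[d] for row in feats_b]
        mean_gap = fmean(col_a) - fmean(col_b)    # difference of means
        std_gap = pstdev(col_a) - pstdev(col_b)   # difference of std devs
        total += mean_gap ** 2 + std_gap ** 2
    return total


# Toy "features" standing in for the outputs of a deep network (hypothetical).
real = [[0.0, 1.0], [2.0, 1.0], [1.0, 3.0]]
fake = [[0.5, 1.0], [2.5, 1.0], [1.5, 3.0]]
print(diagonal_frechet_distance(real, fake))
```

Swapping the feature extractor that produces `feats_a` and `feats_b` changes which image differences the score reacts to, which is the dependence the abstract examines.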

Correlators of Mixed Symmetry Operators in Defect CFTs

Author: Guha Sunny
Nagaraj Balakrishnan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/10/2018

We use the embedding formalism to study correlation functions of a $d$-dimensional Euclidean CFT in the presence of a $q$ co-dimensional defect. The defect breaks the global conformal group $SO(d+1,1)$ into $SO(d-q+1,1) \times SO(q)$. We calculate all possible invariant structures that can appear in one-point, two-point and three-point correlation functions of bulk and defect operators in mixed symmetry representations. Their generalization to $n$-point correlation functions is also worked out. Correlation functions in the presence of a defect, in an arbitrary representation of $SO(q)$, are also calculated.

Comment: 39 pages, 3 figures; v2: published version. Corrected typos and results from section 4.3 of v

arXiv.org e-Print Archive

Directory of Open Access Journals

An Unsupervised Learning Model for Deformable Medical Image Registration

Author: Balakrishnan Guha
Dalca Adrian V.
Guttag John
Sabuncu Mert R.
Zhao Amy
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 20/04/2018

We present a fast learning-based algorithm for deformable, pairwise 3D medical image registration. Current registration methods optimize an objective function independently for each pair of images, which can be time-consuming for large data. We define registration as a parametric function, and optimize its parameters given a set of images from a collection of interest. Given a new pair of scans, we can quickly compute a registration field by directly evaluating the function using the learned parameters. We model this function using a convolutional neural network (CNN), and use a spatial transform layer to reconstruct one image from another while imposing smoothness constraints on the registration field. The proposed method does not require supervised information such as ground truth registration fields or anatomical landmarks. We demonstrate registration accuracy comparable to state-of-the-art 3D image registration, while operating orders of magnitude faster in practice. Our method promises to significantly speed up medical image analysis and processing pipelines, while facilitating novel directions in learning-based registration and its applications. Our code is available at https://github.com/balakg/voxelmorph.

Comment: 9 pages, in CVPR 201

arXiv.org e-Print Archive

Crossref

DSpace@MIT
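The registration abstract above describes reconstructing one image from another through a spatial transform while penalizing non-smooth registration fields. The toy below sketches that kind of objective in 1D. It is an illustrative assumption, not the voxelmorph implementation: the mean-squared-error similarity term, the squared finite-difference smoothness term, the weight `smooth_weight`, and the helper names `warp_1d` and `registration_loss` are all hypothetical choices.

```python
def warp_1d(moving, disp):
    """Resample `moving` at positions i + disp[i] using linear interpolation."""
    n = len(moving)
    out = []
    for i, d in enumerate(disp):
        x = min(max(i + d, 0.0), n - 1.0)  # clamp to the signal bounds
        lo = int(x)                        # left neighbour (x is non-negative)
        hi = min(lo + 1, n - 1)            # right neighbour
        w = x - lo                         # interpolation weight
        out.append((1.0 - w) * moving[lo] + w * moving[hi])
    return out


def registration_loss(fixed, moving, disp, smooth_weight=0.1):
    """Similarity (assumed MSE) plus a smoothness penalty on the displacement."""
    warped = warp_1d(moving, disp)
    similarity = sum((f - w) ** 2 for f, w in zip(fixed, warped)) / len(fixed)
    smoothness = sum(
        (disp[i + 1] - disp[i]) ** 2 for i in range(len(disp) - 1)
    )
    return similarity + smooth_weight * smoothness


moving = [0.0, 1.0, 2.0, 3.0]
fixed = [1.0, 2.0, 3.0, 3.0]
# A constant shift of one sample aligns the signals and is perfectly smooth.
print(registration_loss(fixed, moving, [1.0, 1.0, 1.0, 1.0]))
# A jagged displacement is penalized even before the data term is considered.
print(registration_loss(fixed, moving, [0.0, 1.0, 0.0, 0.0]))
```

In the learning-based setting the abstract describes, a network would predict `disp` from the image pair and a loss of this general shape would be minimized over a collection of images, with no ground truth fields required.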

Visualizing chest X-ray dataset biases using GANs

Author: Balakrishnan Guha
Liang Hao
Ni Kevin
Publication date: 05/09/2023

Recent work demonstrates that images from various chest X-ray datasets contain visual features that are strongly correlated with protected demographic attributes like race and gender. This finding raises issues of fairness, since some of these factors may be used by downstream algorithms for clinical predictions. In this work, we propose a framework, using generative adversarial networks (GANs), to visualize what features are most different between X-rays belonging to two demographic subgroups.

Comment: Medical Imaging with Deep Learning (MIDL) 202

arXiv.org e-Print Archive

MadEye: Boosting Live Video Analytics Accuracy with Adaptive Camera Configurations

Author: Balakrishnan Guha
Netravali Ravi
Ramanujam Murali
Wong Mike
Publication date: 04/04/2023

Camera orientations (i.e., rotation and zoom) govern the content that a camera captures in a given scene, which in turn heavily influences the accuracy of live video analytics pipelines. However, existing analytics approaches leave this crucial adaptation knob untouched, instead opting to only alter the way that captured images from fixed orientations are encoded, streamed, and analyzed. We present MadEye, a camera-server system that automatically and continually adapts orientations to maximize accuracy for the workload and resource constraints at hand. To realize this using commodity pan-tilt-zoom (PTZ) cameras, MadEye embeds (1) a search algorithm that rapidly explores the massive space of orientations to identify a fruitful subset at each time, and (2) a novel knowledge distillation strategy to efficiently (with only camera resources) select the ones that maximize workload accuracy. Experiments on diverse workloads show that MadEye boosts accuracy by 2.9-25.7% for the same resource usage, or achieves the same accuracy with 2-3.7x lower resource costs.

Comment: 19 pages, 16 figures

arXiv.org e-Print Archive